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Abstract. We build up a directed network tracing links from a given integer 
to its divisors and analyze the properties of the Google matrix of this network. 
The PageRank vector of this matrix is computed numerically and it is shown 
that its probability is inversely proportional to the PageRank index thus being 
similar to the Zipf law and the dependence established for the World Wide Web. 
The spectrum of the Google matrix of integers is characterized by a large gap 
and a relatively small number of nonzero eigenvalues. A simple semi-analytical 
expression for the PageRank of integers is derived that allows to find this vector for 
matrices of billion size. This network provides a new PageRank order of integers. 
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1. Introduction 

The number theory pQ is the fundamental branch of mathematics where the theory 
of prime numbers, besides its beauty, finds important cryptographic applications 0. 
It is established that the methods of the Random Matrix theory and quantum chaos 
find their useful applications for the understanding of properties of prime numbers 
and the Riemann zeros [3l HI |5] ■ 

In this work we propose another matrix approach to the number theory based 
on the Markov chains [6] and the Google matrix [7]. The later finds important 
applications for the information retrieval and Google search engine of the World 
Wide Web (WWW) [S]. The right eigenvector of the Google matrix with the largest 
eigenvalue is known as the PageRank vector. The elements of this vector are non- 
negative and have the meaning of probability to find a random surfer on the network 
nodes. The PageRank algorithm ranks all websites in a decreasing order of components 
of the PageRank vector (see e.g. detailed description at [5]). Here, we propose a 
natural way to construct the Google matrix of positive integers using their division 
properties. We study the statistical properties of the PageRank vector of this matrix 
and discuss the properties of a new order of integers given by this ranking. The 
properties of the eigenvalues and eigenvectors are also discussed. 

The paper is constructed as follows: in Section 2 we give the definition of the 
Google matrix of integers, the properties of its PageRank vector are analyzed in Section 
3, the analysis of spectral properties is given in Section 4, the analytical expressions 
for the PageRank vector are presented in Sections 4,5 and the discussion of the results 
is presented in Section 6. 

2. Google matrix of integers 

The elements of the Google matrix G(a) of a directed network with N nodes are given 

by 

G mn (a) = aS mn + (1 - a)/N . (1) 

Here the matrix S is obtained by normalizing to unity all columns of the adjacency 
matrix A mn , and replacing the elements of columns with only zero elements, 
corresponding to dangling nodes, by X/N. An element A mn of the adjacency matrix 
is equal to unity if a node n points to node m and zero otherwise. The damping 
parameter a in the WWW context describes the probability (1 — a) to jump to any 
node for a random surfer. The value a — 0.85 gives a good classification of pages for 
WWW [8] . The matrix G belongs to the class of Perron- Frobenius operators [8] , its 
largest eigenvalue is A = 1 and the other eigenvalues obey |A| < a. In typical WWW 
networks the eigenvalue A = 1 is strongly degenerate at a = 1 (see e.g. [5]) and the 
introduction of a < 1 becomes compulsory to define a unique right eigenvector at A = 1 
and to ensure the convergence of the PageRank vector by the power iteration method 
[5]. The right eigenvector at A = 1 gives the probability P(n) to find a random surfer 
at site n and is called the PageRank. Once the PageRank is found, all nodes can be 
sorted by decreasing probabilities P(n) and an increasing index K(n). The node rank 
is then given by index K(n) which reflects the relevance of the node corresponding to 
a positive integer n. The PageRank dependence on K is well described by a power law 
P(K) oc 1/K^ in with fan Ri 0.9. This is consistent with the relation (3 in = \ j{\i in — 1) 
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Figure 1. The Google matrix of integers: the amplitudes of the matrix elements 
G mn at a = 1 are shown by color with blue for minimal zero elements and red 
for maximal unity elements, with 1 < n < N corresponding to x— axis (with 
n = 1 corresponding to the left column) and 1 < m < N for y— axis (with m = 1 
corresponding to the upper row). The matrix sizes are N = 31 in the left panel 
and N = 101 in the right panel. 



corresponding to the average proportionality of PageRank probability P(n) to its in- 
degree distribution Wi n (k) oc l/k IJ ' in where k(n) is a number of ingoing links for a 
node n [5]. For the WWW it is established that for the ingoing links /j,„ «2.1 (with 
Pin ~ 0.9) while for out-degree distribution w out of outgoing links a power law has the 
exponent fi out w 2.7 jTDl [TT] . Here we analyze properties of PageRank and use the 
notation (3 — ft in . 

To construct the Google matrix of integers, we define for to, n € {1, . . . , N} the 
adjacency matrix by A mn — k where the k is a "multiplicity" defined k as the largest 
integer such that m k is a divisor of n and if 1 < m < n, and k — if to = 1 or to = n 
or if m is not a divisor of n. Thus we have k = if m is not a divisor of n and k > 1 
if to is a divisor of n different from 1 and n. The total size N of the matrix is fixed 
by the maximal considered integer. 

This defines a network where an integer number n is linked to its divisors to 
different from 1 and n itself and where the transition probability is proportional to 
the multiplicity k, the number of times we can divide n by to. The number 1 and 
the prime numbers are therefore not linked to any other number and correspond 
to dangling nodes in the language of WWW networks. For example, the number 
n = 24 has links pointing to m{k) = 2(3), 3(1), 4(1), 6(1), 8(1), 12(1) (multiplicity 
is given in the brackets) so that the nonzero matrix elements in this column are 
3/8, 1/8, 1/8, 1/8, 1/8, 1/8 respectively. We find the total number of links N e = 
Tlmm-Amm taking into account the multiplicity, to be Nt = 6005 at TV = 1000, 
N/, = 1066221 at N = 10 5 , N t = 152720474 at N = 10 7 , N e = 19877650264 at 
N = 10 9 . The fit of the dependence N e = N {a e + b e In TV) gives a e = -0.901 ± 0.018, 
b e = 1.003 ±0.001. 

From the adjacency matrix A we first construct a matrix Sq by normalizing 
the sum in each column, containing at least one non-zero element, to unity and the 
matrix S is obtained from So by replacing the elements of columns with only zero 
elements, corresponding to dangling nodes 1 and prime numbers, by I/TV. The Google 
matrix G is finally obtained from S by Eq. ([!]) for an arbitrary damping factor. The 
PageRank is the right eigenvector of the matrix G with the maximal eigenvalue A = 1: 
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10° 10 2 10 4 10 6 

K 

Figure 2. Dependence of PageRank probability P(K) on PageRank index K for 
the matrix sizes N = 10 3 , 10 4 , 10 5 , 10 6 , 10 7 ; the dashed straight line shows the 
Zipf law dependence P ~ 1/K. 

GP = XP = P. 

The examples of the Google matrix G at a = 1 for N = 31, 101 are shown in 
Fig. [I] We see that most elements are concentrated above the main matrix diagonal 
since the divisors m are smaller than the number n itself. The only exceptions are given 
by the columns at 1 and the prime numbers p which have no divisors (apart from 1 and 
p) and hence they correspond to the dangling nodes with no direct links pointing to 
them. The amplitude of the elements in these columns is uniformly 1 /N. The structure 
of the matrix clearly shows the presence of diagonals m = n/2, n/3, . . . corresponding 
to the small divisors ml = 2, 3, which appear rather often in the division of 
integers. This structure is preserved up to the largest size N = 10 9 considered in this 
work. 

As we will see in Section|3J the eigenvalue Ao = 1 of the matrix S is non-degenerate 
(contrary to typical realistic WWW networks [9]) and in addition its spectrum has a 
large gap with Ao and the other eigenvalues |Aj| < 0.6. In such a case the PageRank 
vector P(K) has a very small variation when the damping factor a is changed in 
the range 0.85 < a < 1 and the convergence of the power method to calculate the 
PageRank is well assured, actually quite fast, even for the damping parameter a = 1. 
Therefore, we limit in this work our studies to the case a = 1 at which G coincides 
with the matrix S and from now on we denote S as "the Google matrix" . 

3. PageRank order of integers 

We first determine the PageRank vector of the Google matrix numerically by the 
power iteration method [8] or by the Arnoldi method [12] using an Arnoldi dimension 
of size riA, which allows to find several eigenvalues and eigenvectors with largest |A| 
for a full matrix size of a few millions (see more details in [51 113j). 

The dependence of PageRank probability P(K) on PageRank index K is shown 
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Figure 3. Dependence of PageRank probability P on the integer number n for 
matrix sizes N = 10 6 , 10 7 (left panel green and red points respectively), and 
rescaled probability nP on n (right panel); data are shown in log- log scale. 




Figure 4. Dependence of the integer number n on the PageRank index K for 
sizes N = 10 5 , 10 6 (left panel green and red points respectively), and 10 7 (right 
panel); data are shown in log- log scale. 

in Fig. [2] We see that with the growth of the system size N the dependence P{K) 
converges to a fixed distribution P{K) on initial K < N/10 values with the tail of 
distribution P(K) at K > TV/10 which is sensitive to the cut-off at the finite matrix 
size N. In the convergent part a formal fit (for 10 < K < 10 5 ) gives the dependence 
P ~ A/K 13 with In A = 0.0431 ± 0.00049, (5 = 1.040 ± 0.0015 being close to the Zipf 
law with = 1 [13] . The small value of 0—1 indicates that there can be a logarithmic- 
correction. Indeed, the fit 1/(PK) = a\ + b\ \nK (for 10 < K < 10 3 ) gives the values 
ai = 16.050 ± 0.187, b x = 2.468 ± 0.036. Thus, it is possible that in the limit of 
N —> oo we have the asymptotic behavior P ~ 1/ (K In K) . Such a scaling looks to 
be more probable due to usual logarithmic corrections in the density of primes [2]. 
However, for the available finite matrix sizes the regime of linear bevahior of 1/(PK) 
versus In K is quite limited and it is not obvious to distinguish between the above two 
fitting dependencies. 

The dependence of PageRank probability P on integer index n is shown in Fig. [3] 
It is characterized by a global decay P oc 1/n with the presence of various branches 
which are especially well visible for the rescaled quantity nP. This structure is 
preserved with the increase of matrix size for the values of n < N/100. The direct 
check shows that the highest plateau corresponds to the prime numbers p. 

Another way to analyze the structures visible in Fig. [3] is to consider the 
dependence of n on the PageRank index K obtained from the PageRank probability 
P(K n ). In fact K gives a new order of integers imposed by the PageRank. The 
dependence n{K) is shown in Fig. Hon a large scale. In a first approximation we find 
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Figure 5. Top panels: dependence of the integer number n on PageRank index 
K for size N = 10 7 shown by red points (left panel), right panel shows zoom of 
data in a rectangle from left panel. Bottom panels: in addition to data of top 
right panel data for TV = 10 6 are shown (left panel), right panel shows zoom of 
data in a rectangular region from left panel. Data are shown in usual scale. 



the layered structure with a sequence of parallel lines naif. This global structure is 
preserved with the increase of the matrix size from N = 10 5 to 10 7 . 

A more detailed view of this structure is shown in Fig. [5] There are well defined 
separated branches with approximately linear dependence n kK with k ~ 4.5 for 
the highest branch which corresponds to the highest plateau in Fig. [3] (right panel) . 
This branch contains only primes. The lower branch contains semi-primes (product 
of two primes) and so on down to smaller an smaller values of k. The whole structure 
looks to have a self-similar structure as it shows a zoom to a smaller scale. The increase 
of the size N gives some modifications of the structure keeping its global pattern (see 
Fig. [5] bottom panels). There is a certain clustering on the (n, K) plane of rectangles 
containing close values of K and integer numbers n. The rectanglers in the upper 
prime-branch contain exclusively prime numbers for n = p. Note that the neighboring 
non-prime values appear in other rectanglers on the right side for larger values of K. 
For example, in the bottom left panel of Fig. [5] we have a rectangle at K ~ 2.6 x 10 4 
and n ~ 10 5 with primes but there is at K ~ 7 x 10 4 another rectangle of semi-primes, 
also with values n ~ 10 5 . 

The direct analysis shows that the rectangles correspond to flat plateaux with 
degenerate values of P(K n ) which appear for finite matrix size N. This degeneracy 
results from only rational numbers appearing in the elements of the Google matrix and 
from its very sparse structure. Inside such flat regions the ordering in K is somewhat 
arbitrary and depends on the precise sorting algorithm used. The K index shown in 
Fig. [5] was obtained by the Shellsort method that may indeed produce a quite random 
ordering for degenerate values thus generating the rectanglers seen in Fig. [5] We have 
verified that when using a modified sorting algorithm with a secondary criterium, to 
sort with increasing n inside a degenerate region, the rectangles are replaced by lines 
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Figure 6. Dependence of the ratio n/K on the PageRank index K for size 
N = 10 7 ; data are shown in semi-log scale. The straight line shows the fit 
dependence n/K = a 2 + b 2 In K for the upper branch in the range 10 < K < 10 4 
with a 2 = 1.3583 ± 0.0099, b 2 = 0.3227 ± 0.0014. 




Figure 7. Dependence of \AK\ = \K n (N2) — K n (N\)\ on the integer n for matrix 
sizes iVi = 10 6 ,Af 2 = 10 7 (green points) and JVi = 10 5 ,7V 2 = 10 6 (red points). 
Left and right panels show the same data either in normal or in log-log scales. 

from the left bottom corner to the right top corner. With increasing values of N 
these rectangles are reduced in size. We numerically find that the first degenerate 
plateau appears at K — Kd and that this number increases with the matrix size N, 
e.g. K d = 27 at N = 1000, 177 at 10 5 , 1287 at 10 7 , 10386 at 10 9 . This dependence 
is well described by the fit K d = a d K bd with a d = 1.284 ± 0.078, b d = 0.432 ± 0.004. 
We return to the discussion of the convergence at large N a bit later. 

Since we find an approximate linear growth of n with K inside each branch it is 
useful to consider the dependence of the ratio n/K on K which is shown in Fig. [6] The 
upper branch of primes is well described by the dependence n/K = 0.322 In K + 1.358 
that shows that in the previous relation k is not a constant but grows logarithmically 
with K. We have an approximate relation b-i = 0.322 w l/b\ = 1/2.468. The lower 
branches also have an approximately logarithmic growth of the ratio n/K with K. 

Finally, let us discuss the stability of the PageRank order of integers in respect 
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to the variation of the matrix size N. The dependence P{K) is definitely converging 
to a fixed function for K <C N as it is well seen in Fig. [2j However, for a fixed integer 
n its PageRank index K n has a visible variation with the increase of matrix size N. 
These variations are visible in Fig. [5] (bottom panels). At the same time the global 
structure of the K n or n(K) dependence shows signs of convergence with the growth 
of N. A more detailed analysis of variation of AK = \K n (Ni) — K n (N2)\ for two 
matrix sizes N2 — lOiVx is shown in Fig. [7] We see that there is a significant decrease 
of variations AK with increase of N\ , even if a small changes of K n values are visible 
even at relatively low n ~ 100. On the basis of these data we make a conjecture that in 
the limit of N — > 00 we will have a convergence to a fixed PageRank order of integers 
K n . However, we expect that this convergence is very slow, probably logarithmic in 
N, thus being the reason that even at N — 10 7 we find some variations in K n . We 
note that the density of states of Riemann zeros also shows very slow convergence so 
that enormously large values of n ~ N ~ 10 20 are required to obtain stable results 

mm- 

4. Spectral properties of the Google matrix of integers 

4.1. Arnoldi method 

To study numerically the spectrum of the Google matrix S — G of integers at a = 1 
we first employ the Arnoldi method |12[ 1 1 3] . This method uses a normalized initial 
vector £0 and generates a Krylov space by the vectors S J £0 for j = 0, . . . , ua — 1 
where ha is called the Arnoldi dimension. Using Gram-Schmidt orthogonalization 
one determines an orthogonal basis of the Krylov space and the matrix representation 
of S in this basis. This provides a matrix S of modest dimension ua of Hessenberg form 
which can be diagonalized by standard QR-methods and whose eigenvalues, called Ritz 
eigenvalues, are in general very accurate approximations of the largest eigenvalues of 
the original (very large) matrix S. 

In this work we have used the Arnoldi dimension tia = 1000 and two different 
initial vectors, first a random initial vector and second a uniform initial vector with 
identical components 1/y/N (thus normalized by the Euclidean norm ||(---}||2)- The 
spectrum of the matrix S is shown in Fig. [8] for two sizes N — 10 6 , 10 7 . We see that 
there are only three eigenvalues within the ring 0.05 < |A| < 0.5 while the majority 
of eigenvalues is concentrated inside a range of |A| < 0.05. The first few largest 
eigenvalues are accurately obtained from both initial vectors used for the Arnoldi 
method and also coincide (up to numerical precision) with the eigenvalues determined 
by a semi-analytical approach (see below). However for the range |A| < 0.05 the 
situation becomes more subtle as it is discussed below. 

We note that Fig. [8] shows a large gap between A = 1 and the next eigenvalue 
thus justifying our above choice of the damping factor a = 1 . 

4-2. Analytical discussion of spectrum 

The Google matrix S at a = 1 has a very particular structure which allows to establish 
some important properties for the spectrum and its eigenvalues. We can write 

S = S + vd T (2) 

where v and d are two vectors of size N with components v n = 1/N and d n — 1 for 
prime numbers n = p or n = 1 and d n = for the other non-prime numbers (different 
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Figure 8. Spectrum of the Google matrix of integers for the matrix size N = 10 
(left panels) and 10 7 (right panels); the red crosses (light blue squares) are 
numerical data from the Arnoldi method with Arnoldi dimension nj^ = 1000 
and a random initial vector (with the unit initial vector) and the dark blue points 
are the exact eigenvalues obtained as the zeros of the reduced polynomial of Eq. 
|6j. Top panels show the whole spectrum and the bottom panels show a zoom of 
the region represented by black squares in the top panels. The eigenvalues have 
significantly higher accuracy for the Arnoldi method with unit initial vector. The 
unit circle |A| = 1 is shown in green. 



from 1). For later use we also introduce the vector e with components e„ = 1 and 
therefore v = e/N. In addition d T denotes the transposed line vector of d. The matrix 
Sq is the contribution that arises from the adjacency matrix A by normalizing the non- 
vanishing columns of the latter and the tensor product v d T represents the values 1 /N 
which are put in the zero columns of So when constructing the full matrix S. The 
normalization condition of the non-vanishing columns of Sq can formally be written 
as e T So = e T — d T which is just the line vector with components for vanishing 
columns of Sq (for prime numbers n or n — 1) and 1 non- vanishing columns of So (for 
the other non-prime numbers different from 1). This expression provides the useful 
identity : 

d T = e T (l-S ) . (3) 

Furthermore we observe that the matrix So has a triagonal form with vanishing 
entries on the diagonals because (So) mn =/= only if m is a divisor of n different 
from 1 and n and therefore for any non-vanishing matrix element (So) mn we have 
m < n/2 < n. This matrix structure can also be seen in Fig.[l] As a consequence Sq 
is nilpotent with S l Q = for some integer I. In the following let us assume that I is the 
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minimal number such that S = 0. Obviously in our model I — [log 2 (iV)] is actually 
a very modest number as compared to the full matrix size N. 

We now discuss how the form of Eq. ([2| affects the eigenvalues of the full matrix 
S. Let ip be a right eigenvector of S and A its eigenvalue : 

JV 

A</> = Sip = S ip + Cv , C = d T ip = ^ n ■ ( 4 ) 

n prime or n—1 

If C = we find that ip is an eigenvector of So- Then A = since the matrix So 
is nilpotent and cannot have non-vanishing eigenvalues. The matrix Sq is actually 
non-diagonalisable and can only be transformed to a Jordan form with quite large 
Jordan blocks and as diagonal element of each of the Jordan blocks. 

Suppose now that C ^ implying that A ^ since the equation Soip = —Cv 
does not have a solution for ip because So has many zero rows and v n = 1/N ^ for 
each n — 1, . . . ,N. Since A ^ the triagonal matrix Al — Sq is invertible and from 
Eq. Q we obtain : 

^ = C(Al-5 )- 1 U = ^(^)\ . (5) 

3=0 v ' 

Note that the sum is finite since S\ = 0. The eigenvalue A is determined by the 
condition that this expression of %p has to satisfy the condition C = d T ip- Multiplying 
this condition by X 1 jC we find that A is a zero of the following reduced polynomial of 
degree /: 

l-i 

V r {\) = A' - A i_1_j cj ■ = , c 3 =d T S J v . (6) 

3=0 

This calculation shows that there are at most I eigenvalues A ^ of S given as the 
zeros of this reduced polynomial. 

We note that using S l Q — and the identity ^ one finds that the coefficients Cj 
obey the following sum rule : 

l-i / i-i 

E c ^ =dT E^o I v = e"'(l-So)(l-So)- 1 v = l (71 

3=0 \j=0 

since e T v = ^2 n v n = 1. This sum rule ensures that A = 1 is a zero of the reduced 
polynomial and the PageRank as the eigenvector of A = 1 is obtained from ^ : 

P = C^Siv , C- 1 =Y,e T Slv (8) 

3=0 j=0 

where the identity for C _1 is due to the normalization of P. 

Since the degree I = [log 2 (AT)] of the reduced polynomial is very modest: 
9 < / < 29 for 10 3 < N < 10 9 , we have determined numerically the coefficients 
Cj which only requires a finite number of successive multiplications of So to the initial 
vector v and determined the zeros of the reduced polynomial by the very efficient 
Newton-Maehly method in the complex plane. The resulting I eigenvalues (and the 
trivial highly degenerate eigenvalue A — of S) obtained from this semi-analytical 
method are also shown in Fig. [5J 



PageRank of integers 



11 



The numerical determination of the zeros shows that they are all simple zeros 
of the reduced polynomial but at this point we are not yet sure that they are also 
non-degenerate as far as the full matrix S is concerned. In theory we might still have 
principal vectors <j> associated to some eigenvalue A ^ such that S(f> = X<fi + ip with 
ip being the eigenvector at A. However, we can exclude this scenario by determining 
the full characteristic polynomial of S: 

V S (X) = det(Al- S -vd T ) 



= A JY dct(l - So/A) det [l - (1 - S / Xy 1 v d T / X] 



[l-d T (l-S /Xy 



v/X] = X 



N-l 



Vr(X) 



(9) 



since det(l — So/A) = 1, det(l — uw ) = (1 — w u) for arbitrary vectors u and 
w, and the matrix inverse has been expanded in a finite sum in a similar way as in 
Eq. ([5]). According to Eq. ([9| we observe that the simple zeros of V r {X) are also 
simple zeros of Vs(X) and have therefore an algebraic multiplicity equal to one. This 
proves that there are no principal vectors and no non-trivial Jordan-Block structure 
for A / 0. On the other hand the eigenvalue A = has algebraic multiplicity N — I 
with many large Jordan-Blocks. 

The /-dimensional subspace associated to the eigenvalues A 7^ is according to 
Eq. ([5]) generated by the / vectors v^) = S^v with j = 0, . . . , I — 1 which form a 
basis of this subspace. Using Eqs. ^ and ([6]), we may easily determine the matrix 
representation of S with respect to this basis by: 



(*■) 



0, 



.,1-1(10) 



fc=0 



where for simplicity of notation for the case j = I— 1 we write v"' = 0. The / x /-matrix 
S has the explicit form : 





t 


Co 


ci • 


' C;_ 2 


Cl-l 






1 


• 








s = 







1 • 










V 





• 


1 






(11) 



/ 



One easily verifies that the characteristic polynomial Vg(X) of this matrix coincides 
with the reduced polynomial (|6| and its / eigenvalues are therefore exactly the / non- 
vanishing eigenvalues of the full matrix S. Using the sum rule ^ one notes that the 
/-dimensional vector (1, l) T isa right eigenvector of S with eigenvalue A = 1 thus 
confirming the PageRank expression P oc X) 7 =o v ^ t see a ^ so Eq- ( 



A direct numerical diagonalisation of the matrix (111 is tricky and fails to produce 
the smaller eigenvalues (below 1CU 2 ) due to numerical rounding errors since the 
coefficients Cj decay very rapidly, e. g. C22 ~ 10 -38 for TV = 10 7 with / = 23. However, 
we may numerically diagonalize the "equilibrated" matrix : p^ 1 S p which has the 
same eigenvalues as S and where p is a diagonal matrix with diagonal matrix elements 
pjj = l/cj-%. The eigenvalues obtained from the equilibrated matrix coincide very 
precisely (up to numerical precision 10~ 14 ) with the zeros obtained from the reduced 
polynomial by the Newton-Maehly method. In Fig. [HJ we also show these / zeros 
for iV = 10 6 and N = 10 7 . Apparently, both variants of the Arnoldi method fail to 
confirm the analytical result that there are only / non- vanishing eigenvalues, a point we 
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Figure 9. Loft panel: dependence of 7, = 21n|A,-| on the index j for the / non- 
vanishing eigenvalues of S and various matrix sizes N. Right panel: dependence 
of 71 on (ln7V)~ 1 (red line with crosses). The green line corresponds to the fit 
71 (JV) = 71(00) + A7/I11JV for the range 10 5 < N < 10 9 (i.e. (IniV)- 1 < 0.09) 
with 71(00) = 1.020 ± 0.006 and A7 = 7.14 ± 0.09. 



attribute to the numerical instability of the highly degenerate and defective eigenvalue 
A = and which we will discuss below. 

To study the evolution of the eigenvalue spectrum with N it is actually convenient 
to introduce the variable jj = — 2 In | Aj | . The dependence on jj on the index j is 
shown in left panel of Fig. [9] It appears that the 7-spectra for different values of N 
fall roughly on the same curve except for the last one or two values of each spectrum. 
This universal curve can be roughly approximated by a piecewise linear function with 
two slopes w 4/3 for < j < 6 and « 1/7 for 6 < j < 28. 

We note that the convergence of the first nonzero 71 is compatible with the law 
7i (N) « 7i(oo) + A7/lniV with 71(00) = 1.020±0.006 and A7 = 7.14±0.09 obtained 
from a fit in the range 10 5 < N < 10 9 . This fit is actually very accurate as can be 
seen from the small error of 71(00) and the right panel of Fig. [9] Once more, such a 
dependence indicates a very slow logarithmic convergence with the system size N. 



In Fig. 10 we show the amplitude of the second eigenvector ipi at 

Ai = —0.28422 + i 0.38726 for N = 10 7 versus the K index. Despite some fluctuations 
this eigenvector seems to be close to the PageRank as far as the overall distribution of 
very large and small values is concerned. This behavior does not come as a surprise 
in view of the expansion [see Eq. ([5])] : 

(-1 

ipi oc ^2 ^i 31 v ^ ■ (12) 
3=0 

In principle the fact that |Ai| is well below 1 indicates that the contributions of v^' for 
larger values of j increase. However, as we will discuss in the next section, the overall 
size of decays with increasing j much faster than the increase by the factor X ± J 1 
and therefore mainly the first few terms of this sum contribute to ip\ in a similar way 
as for the PageRank (see Section [5|. 

Finally in Fig. [lOj also the numerical difference of the PageRank determined by 
the standard power method and the semi-analytical expression ([8]) is shown. The 
relative difference is ~ 10~ 10 for the full range of K thus numerically confirming the 
accuracy of Eq. ((8|) . 
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K 

Figure 10. Dependence of the PageRank vector P (red curve) and the eigenvector 
1^1 1 (blue crosses) on PageRank index K for N = 10 7 . Here the eigenvalue is 
Ai = -0.28422+ i 0.38726 (|Ai| = 0.48037, 71 = 1.4663; and the corresponding Vi 
is normalized by the condition |i/>i(n)| = 1); green curve shows the difference 
|AP| between the numerically computed PageRank P (red curve) and semi- 
analytical computation of PageRank; for clarity |AP| is multiplied by a factor 
10 s . 



4-3. Numerical problems due to Jordan blocks 

The question arises why the Arnoldi Method for both initial vectors, random and 
uniform, (and also direct numerical diagonalization for small matrix sizes N < 10 4 ) fail 
to confirm the analytical result that there are only I = [log 2 (iV)] non-zero eigenvalues 
A 7^ of S. The reason is that the big subspace of dimension N — I associated to the 
eigenvalue A = with a lot of large Jordan blocks is numerically very problematic. 
This effect for such a defective eigenvalue is well known in the theory of numerical 
diagonalization methods [H] ■ To understand this a bit clearer consider a "perturbed" 
Jordan block of size D: 
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1 • 
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• 
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V e 
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/ 



(13) 



which has a characteristic polynomial X D — (— 1) £ and therefore complex eigenvalues 
that scale as |A| ~ e x l D as a function of the perturbation e while for e — we have 
A = with multiplicity D. Therefore a value of e ~ 10~ 15 due to numerical rounding 
errors may still produce strong numerical errors in the eigenvalues if D is sufficiently 
large. In our case Fig. [8] shows that the eigenvalues obtained by the Arnoldi method 
are accurate for |A| > 1CP 2 . 

As can be seen in Fig. [8] there is also a difference in quality between the two 
initial vectors chosen for the Arnoldi Method. Using a random initial vector the 
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Arnoldi method produces some wrong isolated eigenvalues in the intermediate regime 
0.01 < |A| < 0.02 and in the case N — 10 7 some of the semi-analytical eigenvalues 
in the same regime are not accurately found. However, for the uniform initial vector 
the Arnoldi method produces rather accurate eigenvalues even for |A| « 0.005. The 
reason is that the uniform initial vector corresponds (up to normalization) to the vector 
v = e/N. In view of Eq. ( 10 ) the Arnoldi method generates, at least in theory, exactly 
the ^-dimensional subspace spanned by the vectors v^' and should exactly break off at 
ua = I with a vanishing coupling matrix element from the subspace to the remaining 
space. However, due to numerical rounding errors and the fact that the vectors are 
badly conditioned, i.e. mathematically there are linearly independent but numerically 
nearly linearly dependent, the coupling matrix element is of order 10 -3 (for N — 10 7 ). 
As a consequence the Arnoldi method continues to generate new vectors producing a 
cloud of "artificial" eigenvalues inside a circle or radius ~ 0.005. These eigenvalues 
are generated by the above explained mechanism of perturbed Jordan blocks. 

The Arnoldi method with a random initial vector produces a similar, slightly 
larger cloud, of such artificial eigenvalues but here, even without any numerical 
rounding errors, the method should not break off due to a bad choice of the initial 
vector and actually it even produces some "bad" eigenvalues outside the Jordan block 
generated cloud. 

We mention that it is possible to improve the numerical behavior of the Arnoldi 
method with uniform initial vector by the following "tricks" : first we chose a different 
matrix representation of S where the first basis vector (associated to the number "1") 
is replaced by the uniform vector e and second where the scalar product used for the 
Gram-Schmidt orthogonalization is modified with stronger weights ~ n 2 for the larger 
components. This modified Arnoldi method produces a very small coupling matrix 
element ~ 10~ 10 (for N = 10 7 ) at ua = I and numerically very accurate eigenvalues 
(up to 10~ 10 ) for all I non-vanishing eigenvalues. If we force to continue the Arnoldi 
iterations (tia ^> we obtain again a Jordan block generated cloud of eigenvalues 
but whose size is considerably reduced as compared to both original variants of the 
method. 



5. Self-consistent determination of PageRank and analytic approximation 

The eigenvalue equation of the PageRank: P = C ' v + So P with C — d T P [see Eq. 

can be interpreted as a self-consistent equation for P defining a very effective 
iterative method to determine P in a few number of iterations. Let us define the 
following iteration procedure : 

p(°)=0 , P u+1) = Cv + S P u) , j = 0, 1, 2, ... . (14) 

In principle the constant C — d T P is only obtained once the exact PageRank is 
known. Therefore in a practical application of this iteration, one first chooses some 
arbitrary non- vanishing value for C and normalizes the PageRank once the procedure 
has converged. However, for reasons of notations we chose to keep the value C = d T P 



in Eq. (14) from the very beginning. 



We note that the iteration ( 14 ) can formally be solved by the sum : 

P <J) =C J2 S i )V = C Y j »W . (15) 
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Figure 11. Decay of the quantity Sj = ||P' J ' — P||i repr esen ting the error of 
the approximate PageRank pW) after j iterations of Eq. |l4| l (for N = 10 7 ). 
The left panel shows Sj versus j and the green line is obtained from the fit: 
Info-) = a z -b 3 j-c z j 2 witha 3 = 1.6±Q.4, b 3 = 1.48±0.08 and 63 = 0.117±0.004. 
The right panel shows — ln(<5j) versus j and the green line is obtained from the fit: 
ln[- ln(5j)] = a 4 + b 4 j for j > 12 with a 4 = 2.46 ± 0.03 and 64 = 0.092 ± 0.002. 
Note that both panels use a logarithmic representation for the vertical axis. 



Since S l Q = for I = [log 2 (iV)J the iteration not only converges but it actually provides 
the exact PageRank P — P"' after a finite number of iterations when j = I and in 
which case Eq. (15) coincides with our previous result 

We mention that the power method, where one successively multipli es t he matrix 
S = v d T + So to an initial (normalized) vector is somewhat similar to (14 1 but with 
a very crucial difference. In the power method the constant C is updated at each 
iteration according to C^> = d T P^' and here the initial vector must be different 
from 0. We remind that the power method converges exponentially with an error 
~ I Aip where Ai being the second eigenvalue of S with |Ai| « 0.5 for N = 10 9 and 



an extrapolated value |Ai| sa 0.6 in the limit TV — > 00. As can be seen in Fig. 11 the 



iteration (14) actually converges much faster than |Ai| J which is simply due to fixing 
the constant C from the beginning and not updating it with the iterations. 

The norm Sj = \\PV' — P\\i of the error vector after j iterations decays much 



faster than exponentially with j as it is shown in Fig. 11 For N = 10 7 one can 
quite well approximate the error norm by the fit : Sj rs exp(1.6 — 1.48 j — 0.117j 2 ) 
representing a quadratic function in the exponential. Furthermore, for j close to I 
we have the approximate ratio Sj/5j-i « 10 -2 and not 0.5-0.6 as the power method 
would imply. For j > 12 one can actually identify a regime of supcrconvergcncc where 
the logarithm of the error behaves exponentially : — ln(<5j) ~ exp(2.46 + 0.092 j) but 
the parameter range for j is too small to decide if there is really superconvergence. 
However, both fits clearly indicate that the convergence is considerably faster than 
exponential. 

As a consequence of the very rapid convergence dependent on the required 
precision, it is sufficient to apply the iteration ( 14 ) only a few number of times j -C I 
to obtain a reasonable approximation. For example, Fig. 12 shows for N = 10 7 that 
on a logarithmic scale P^ 3 ' and P are already very close. 

This allows to obtain a very simple analytical approximation of the PageRank : 
P w p( 3 ) = + + «( 2 ). For this let us rewrite the recursion = So in 

a different way : 

[N/n 



if n>_2 an, *«> - , (16, 



m=2 
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Figure 12. Left panel: Comparison o f th e first three PageRank approximations 
pM for j = 1, 2, 3 obtained from Eq. ( |l4| and the exact PageRank P versus the 
PageRank index K . Right panel: Comparison of the dependence of the rescaled 
probabilities nP and n?' 3 ) on n. Both panels correspond to the case N = 10 7 . 



where for given two integers n and m > 1 the multiplicity M(n, m) is the largest 
integer such that divisor of n and Q(n) = X^m=2 M(n, m) is the number 

of divisors of n (different from 1 and n itself) counting divisors several times according 



to their multiplicity. The appearance of the multiplicity M(mn,n) in (16 1 is not very 
convenient for numerical evaluations. Either one recalculates the multicity at each 
use or one sacrifices a big amount of memory to store them. It is actually possible to 



rewrite Eq. ( 16 ) in a way that the multiplicities no longer appear explicitely. For this 
we note that the case M(mn,n) > 2 implies only to those values of m such that n is 
a divisor of m implying m = rhn and mn = mn 2 . This produces a second sum where 
one uses the multiples of n 2 and in a similar way a further sum with multiples of n 3 



for the cases M(mn,n) > 3 and so on. For n > 2, we may therefore rewrite Eq. (16) 
in the following equivalent expression : 

[N/n] n"<N [N / 

mn 1 



v u+i) = y 1 v u) + y L _ v U) v r 17) 



where each term in the sum of v takes into account for the contributions with 
M{mn, m) = v. Note that the extra sums start at m = 1 since n > 2 and therefore 
mn 1 ' > n even for m = 1. The above PageRank iteration (14) can be written in 
a similar way (see below) but for practical purposes, numerical or analytical, it is 
actually more convenient to use the recurrence for the vectors and to add them 



to obtain the PageRank according to Eq. (15) 



Both Eqs. (16) and (17) are also very efficient for a numerical evaluation, 
especially in terms of memory usage, since the matrix Sq is represented by "only" N 
integer values Q(n), n = 1, . . . , N which is much less than the number (~ N\nN) of 
non-zero double-precision matrix elements of Sq (even completely taking into account 



the sparse structure of So). When using Eq. (16) one can recalculate at each time 
the multiplicities M[n, m) which is not very expensive. However, it turns out that 
the additional sums in Eq. |T7]) are slightly more effective than this recalculation. 



Furthermore, for the iteration of the number of non- vanishing elements is reduced 



by a factor of two at each iteration. As a consequence we may replace in Eqs. ( 16 ) and 



(17) N by [A^ - -?] and thus considerably reduce the computation time. We note that 
the direct iteration of P^> instead does not have this advantage. Actually, in terms 
of numerical computation time the approximation to stop after few iterations is not 
very important since in any case the higher order corrections require less computation 
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time. Using the iteration (17), we have been able to determine numerically the vectors 
and therefore the PageRank, the coefficients Cj and the resulting I = [log 2 N] non- 
zero eigenvalues of 5* for system sizes up to N = 10 9 . 

In addition, Eq. ( 16 ) allows also for some analytical approximate evaluation of 
the first vectors. The initial vector is Vn ^ = 1/N. Let us try to evaluate the next 
two vectors Vn and v^f 1 for the most important case where n is a prime number p. 
Furthermore, in the sum ( 16 ) the most important contributions arise for m also being 
a prime number q such that Q(qp) = 2 and M(qp,p) — 1 (except for the case q = p 
which we neglect) resulting in : 

[N/p] 



,(1) 



E 

q=2, prime 



l 

2N 



1 

2N 



1 



2p(bxN -bxp) 



(18) 



where n(n) rj n/ ln(n) (for n 3> 1) is the number of prime numbers b elow n. However, 
these values of at prime numbers n — p do not contribute in ( 16 1 for the next 
iteration j = 1 when trying to determine v^ 2 \ To obtain the leading contributions in 
i/ 2 ) we need Vn for n = p\ pi being a product of two prime numbers. In this case, we 
have Q{qp\P2) = 2 3 — 2 = 6 if q, p\, P2 are three different prime numbers. Assuming 
Pi 7^ P2 and neglecting the complications from the few cases q = p\ or q — p%, we find 
that : 



Pl P2 



1 

67V 



N 

P\P2 



1 



6pip 2 (InJV- lnpi - \np 2 ) 



(19) 



For the case n = p 2 , i.e. p\ = p 2 = p, we have Q(qp 2 ) = 5 (since p has multiplicity 2) 
resulting in : 

1 



1 

57V 



From (161 for j 



1 and ( 19 1 we obtain 

[N/{2p)\ 



,(2) 



1 



12N 



E 



5p 2 (lniV-2 1np) 



N 
pq 



(20) 



(21) 



Here we have reduced the sum from q < [N/p] to q < [N/{2p)\ since n([N/(pq)]) is 
non zero only for N/(pq) > 2 and therefore q < N/(2p). Now, we replace the sum 
Y]„(- • •) over the prime numbers by an integral J dqir'{q) (• • •) where 7r'(g) « l/ln(g) 
is the average density of prime numbers at q resulting in : 



,(2) 



1 



12iV 

1 

V2p 

1 

12p 



N/(2p) 

dq 7r 

N/(2 P ) ^ 

q 

ln(JV/(2p)) 



N 

pq 



ir'(q) 



(ln(N/p) - Inqj \nq 



du 



1 



In 2 



m(7V/p) 



u u 
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Figure 13. Left panel: The full lines correspond to the dependence of PageRank 
probability P(K) on PageRank index K for the matrix sizes N = 10 7 , 10 8 , 10 9 
with the PageRank evaluated from the expression (JsJ) using the efficient numerical 
method based on Eq. (17). The green crosses correspond to the PageRank 
obtained by the power method (PM) for N = 10 7 ; the dashed straight line shows 
the Zipf law dependence P ~ 1/K. Right panel: same as in left panel (without 
data from the power method) for a simplified model for the Google matrix of 
integers where all multiplicities M(n, m) are replaced by 1, i.e. n is to linked 
to its divisors m only once even if n can be divided several times by m. The 
PageRank was numerically evaluated by the same efficient method using Eqs. (8| 
and ( [Id) with M(n,m) = 1. 



From (181 and (22) we obtain the PageRank approximation at integer values 

P, « if * C(i + „<.» + „<«) » ^ (l - 1„,„2 + !5|*) (23) 

where we have assumed that N ^> p and replaced \n(N/p) — In N — hip sa In N and 
C is the same constant as used in (14). 

The important point with this expression is that it is of the form P p rs Cn/p 
where Cn is a constant depending on N. In order to compare with our above results, 
especially in Fig. [2j we have to replace p by the K index. Assuming that the K 
index is dominated by the prime numbers we have K = n(j>) ~ p/hip implying 
p ss Klnp ss K \nK thus providing the behavior P(K) m Cn/(K \nK) already 
conjectured above based on the numerical results. Concerning the numerical value of 
the constant Cat we find that, at N = 10 7 , it is roughly one order of magnitude too 
small as compared to the numerical results. 

We remind that the considerations leading to the expression (23) are based on 
a lot of assumptions and quite crude approximations, especially the replacement of 
7r(n) ~ n/\sx(n), even if n = 0(1), and we have neglected a lot of contributions from 
numbers with more factors in their prime factor decomposition which are most likely 
responsible for the reduced numerical prefactor. Furthermore, the assumption that the 
PageRank is dominated by prime numbers is not completely exact since certain non- 
prime numbers with a small number of factors intermix with larger prime numbers 
in the PageRank, thus modifying the dependence of the prime numbers on the K 
index from p ss K ln(i^) to p w K (1.36 + 0.323 In if) according to the fit in Fig. [6] 
for N = 10 . However, despite the approximations, we recover the leading parametric 
dependence of P ~ 1/(K In K). 

The PageRank dependence P{K) obtained from the expression (JsJ) using the 
efficient numerical method based on Eq. (17) is shown in Fig. [13] (left panel) for 
N = 10 7 ,10 8 ,10 9 . For N = 10 7 these data agree with the computation result by 
the Arnoldi power method with the numerical accuracy of the order of 10 -10 (see 
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Figure 14. Dependence of \AK\ = \K n {N<2) — K n (N\)\ on integer n for matrix 
sizes Ni = 10 s , iVa = 10 9 (green points) and Ni = W 7 ,N2 = 10 8 (red points). 
Left and right panels show the same data in normal and log-log scales. Note the 
strongly reduced vertical scale of the left panel as compared to the left panel of 
Fig. [jj The vertical scale of the right panel was not reduced allowing a direct 
comparison with the right panel of Fig. [7J The data was obtained by the same 
efficient numerical method as in the left panel of Fig. |13| 




also Fig. 10). This confirms the efficiency of our semi-analytical computation of the 
PageRank. 

We note that it may be useful to consider a simplified model for the Google matrix 
of integers when multiplicity of all divisors is taken to be unity. The numerical fit of 
data shows that in this case the number of links scales as Ni = N (at + be In N) with 
a e = -1.838 ±0.002, b e = 0.999 ± 0.0002. For this model we have the same expression 
(16) but with the replacements M(nm,m) — > 1 and Q(n) —> Q*{n) where Q*(n) is 
the number of divisors of the integer n excluding 1 and n itself without multiplicities, 
e. g. Q*(2) = 0, Q*(3) = 0, Q*(4) = 1, . . .. Note that this quantity is given by the 

expression Q*(n) = (j\j(pj + l)j — 2 where fij are the exponents in the prime factor 

decomposition of n = ■ pj 3 . 

The dependence of the PageRank on K for the simplified model is shown in the 
right panel of Fig. |13| It shows practically the same behavior as in the main model 
shown in the left panel. In this case the analytical expression for the PageRank P, 
obtained from the first three terms, has a very simple form 

/ W/n] 

P.-if-Jl+E^ (24) 

\ 7711— 1 

[N/n] [N/( nmi )] \ 

+ E E — — ] 

mi— 2 ?7i2—2 



*(miri) Q*(m2min) 



where N is the matrix size and ctat is the global normalization constant determined 
by the condition P n = 1. This simple formula gives a good description of the 

PageRank behavior shown in the right panel of Fig. [13] Indeed, the direct count shows 
that the ratio R ms of the total number of links Ni for both models (counted with or 
without multiplicities) approaches to unity for large matrix sizes. For example, we 
have R ms = 1.184 (N = 1000), 1.102 (10 5 ), 1.070 (10 7 ), 1.052 (10 9 ). Thus we think 
that in the limit of large N both models converge to the same type of behavior. It is 
possible that the simplified model may be more suitable for further analytical analysis. 
However, in this work we present data for the simplified model only in the right panel 
of Fig. pi 
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Using the PageRank data obtained by the self-consistent approach for large 
N = 10 7 , 10 8 , 10 9 we can analyze the convergence of the PageRank order K n at larger 
sizes compared to those used in Fig. [7j These new results for variation of \AK\ are 



presented in Fig. 14 They show that the variation \ AK\ decreases with the increase of 
./V from 10 7 up to 10 9 even if the process is slow. A direct comparison shows that the 
first deviation in the order K n appears at K = K s = 13 (comparing N = 10 6 vs. 10 7 ), 
K s = 27 (10 7 vs. 10 s ), K s = 30 (10 8 vs. 10 9 ). We find that the stable range interval 
K s grows with N but this growth seems logarithmic like with K s ~ In TV. Such 
a growth seems to be natural in the view of logarithmic convergence of the second 
eigenvalue Ai discussed above and all logarithmic factors appearing in the density of 
primes. We also note that the value of K s is significantly smaller than the value of Kd 
at which the first degenerate flat plateau appears in the PageRank P(K) and hence 
these degeneracies do not affect the order of the first K s integers. 

On the basis of the obtained results we conclude that for our maximal matrix 
size N = 10 9 we have convergence of the first 32 values of K n . These numbers n, 
corresponding to the values of K = 1, 2, . . . , 32, are n = 2, 3, 5, 7, 4, 11, 13, 17, 6, 19, 

9, 23, 29, 8, 31, 10, 37, 41, 43, 14, 47, 15, 53, 59, 61, 25, 67, 12, 71, 73, 22, 21. There 
are about 30% of non-primes among these values. We mention that the positions of the 
first non-primes 4, 6, 9 can be alrea dy o btained from the first order approximations of 
v^ 1 ' discussed above. According to ( 18 ) the relative weight of a prime number in first 
order is l/(2p). For the two square numbers 4 and 9 the weight is according to (20) 
either 1/(5 x 4) = 1/(2 x 10) or 1/(5 x 9) = 1/(2 x 22.5) explaining that 4 is between 
the primes 7 and 11 and that 9 is between 19 and 23. For the product 6 = 2 x 3 we 
have according to ( 19 1 the weight 1/(6 x 6) = 1/(2x18) implying that 6 is between 17 
and 19. However, this simple argument docs not work for other numbers, for example 
for 10 (or 14) it would imply an incorrect position between 29 and 31 (41 and 43). We 
mention that more numerical data are available at the web page |15j . 

For the simplified model we find at N = 10 9 for the first values K = 1, 2, . . . , 32 
a slightly different order of integers n = 2, 3, 5, 4, 7, 11, 13, 17, 9, 6, 19, 8, 23, 29, 31, 

10, 37, 41, 43, 14, 47, 15, 53, 25, 59, 16, 61, 12, 67, 71, 22, 21. Here the absence of 
multiplicities increases the weight for square numbers of primes to l/(4p 2 ) implying 
that these numbers are slightly advanced in the K order as compared to our main 
model. The modified weight for 9 is 1/(2 x 18) coherent with new position between 
17 and 19 (with 6 having the same first order weight as 9 and also being between 17 
and 19). For 4 the weight is increased from 1/(2 x 10) to 1/(2 x 8). However, this 
increase is not sufficient to explain the new position of 4 between 5 and 7. 

One might mention as a curiosity a special "prime integer network model" where 
a non-prime number n is only linked to its prime factors (and not to all of its divisors). 
In this case the matrix So is strongly simplified such that Sq = 0, i.e. I — 2 being 
independent of the system size and hence there are only two non- vanishing eigenvalues 
of the Google matrix which are Ao = 1 and Ai = Co — 1 ~ — 1 + 1/lniV where 
Co = (tt(N) + 1)/N fa 1/lniV is the ratio of the number of primes and unity to N. 
This is simply seen from the definition of Cj in Eq. ^ and the trace cq = Ao + Ai of 
the matrix ( [TT] ) which is of size 2 x 2 for this case. According to ^ the PageRank P 
and the second eigenvector ipi are given by P cx e + i/ 1 * 1 and ipi oc e — v^'/(l — 1/bxN) 
where e is the vector with all components equal to unity and is a vector such that 
VrP = for non-prime numbers n o r n — 1 and t>„ for prime numbers n — p is given 
by an equation similar to Eq. ( 16 ) for j ' = with Vnm being replaced by unity and 
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multiplicities and number of divisors adapted for the prime integer network model. 
Here both versions, with or without multiplicities are possible. The eigenvalues do not 
depend on the version but the eigenvectors do. For both cases it is pretty obvious that 
the K index gives exactly the sequence of prime numbers below N in increasing order 
followed by a large degenerated plateau for the non-prime integer numbers. Note that 
here the second eigenvalue converges to —1 with a correction 1/ In (AT) for large iV thus 
closing the gap in |A| of the Google matrix. 

6. Discussion 

In this work we constructed the Google matrix of integers based on links between a 
given integer n and its divisors. The numerical analysis based on the Arnoldi method 
allowed us to show that the PageRank P{K n ) of this directed network decays with 
PageRank index K n of an integer n approximately as P{K n ) ~ \/{K n h\K n ) being 
similar to those of the Zipf law and those found for the WWW. However, the spectrum 
of the Google matrix has a large gap appearing between the unit eigenvalue and other 
eigenvalues while the spectrum of the Google matrix of WWW usually has no gap. We 
developed an efficient semi-analytical method to compute the PageRank of integers 
which allowed us to determine the dependence P(K n ) up to matrix size of one billion. 
We show that the dependence of PageRank on the integer number n is characterized 
by a series of branches corresponding to primes, semi-primes and numbers with a 
higher products of primes. Our data show a logarithmic like convergence of PageRank 
order of integers to a fixed order in the limit of matrix size going to infinity. 
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